51 research outputs found

    Machine learning for network based intrusion detection: an investigation into discrepancies in findings with the KDD cup '99 data set and multi-objective evolution of neural network classifier ensembles from imbalanced data.

    For the last decade it has been commonplace to evaluate machine learning techniques for network-based intrusion detection on the KDD Cup '99 data set. This data set has served well to demonstrate that machine learning can be useful in intrusion detection. However, it has attracted criticism in the literature, and it is out of date. Some researchers therefore question the validity of findings based on it. Furthermore, as identified in this thesis, there are discrepancies in the findings reported in the literature, and in some cases the results are contradictory. Consequently, it is difficult to analyse the current body of research to determine the value of the findings. This thesis reports on an empirical investigation to determine the underlying causes of the discrepancies. Several methodological factors, such as choice of data subset, validation method and data preprocessing, are identified and found to affect the results significantly. These findings have also enabled a better interpretation of the current body of research. Furthermore, the criticisms in the literature are addressed and future use of the data set is discussed, which is important since researchers continue to use it for lack of better publicly available alternatives. Due to the nature of the intrusion detection domain, there is an extreme imbalance among the classes in the KDD Cup '99 data set, which poses a significant challenge to machine learning. In other domains, researchers have demonstrated that well-known techniques such as Artificial Neural Networks (ANNs) and Decision Trees (DTs) often fail to learn the minor class(es) due to class imbalance. However, this has not previously been recognised as an issue in intrusion detection. This thesis reports on an empirical investigation demonstrating that class imbalance is what causes the poor detection of some classes of intrusion reported in the literature.
An alternative approach to training ANNs is proposed in this thesis, using Genetic Algorithms (GAs) to evolve the weights of the ANNs, referred to as an Evolutionary Neural Network (ENN). When employing evaluation functions that calculate the fitness proportionally to the instances of each class, thereby avoiding a bias towards the major class(es) in the data set, significantly improved true positive rates are obtained whilst maintaining a low false positive rate. These findings demonstrate that the difficulty of learning from imbalanced data is due not to limitations of the ANNs but to the training algorithm. Moreover, the ENN is capable of detecting a class of intrusion that has been reported in the literature to be undetectable by ANNs. One limitation of the ENN is a lack of control over the classification trade-off the ANNs obtain. This is identified as a general issue with current approaches to creating classifiers. Striving to create a single best classifier that obtains the highest accuracy may give an unfruitful classification trade-off, which is demonstrated clearly in this thesis. Therefore, an extension of the ENN is proposed, using a Multi-Objective GA (MOGA), which treats the classification rate on each class as a separate objective. This approach produces a Pareto front of non-dominated solutions that exhibit different classification trade-offs, from which the user can select one with the desired properties. The multi-objective approach is also utilised to evolve classifier ensembles, which yields an improved Pareto front of solutions. Furthermore, the selection of classifier members for the ensembles is investigated, demonstrating how this affects the performance of the resultant ensembles. This is key to explaining why some classifier combinations fail to give fruitful solutions.
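
The two ideas in the abstract above, a fitness that weights every class equally and a Pareto front over per-class classification rates, can be illustrated with a small sketch. This is not the thesis's implementation: the function names are hypothetical, and mean per-class recall is only one assumed reading of "fitness proportional to the instances of each class".

```python
from typing import List, Sequence, Tuple


def per_class_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> List[float]:
    """Fraction of correctly classified instances, computed separately per class."""
    rates = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        rates.append(sum(1 for i in idx if y_pred[i] == c) / len(idx))
    return rates


def balanced_fitness(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Assumed fitness: mean per-class recall, so a rare class counts as much as a common one."""
    rates = per_class_recall(y_true, y_pred)
    return sum(rates) / len(rates)


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """a dominates b if it is at least as good on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: Sequence[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the non-dominated points (here: tuples of per-class rates)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Toy imbalanced data: 8 instances of class 0, 2 of class 1.
    y_true = [0] * 8 + [1] * 2
    always_major = [0] * 10  # a classifier that ignores the minor class
    accuracy = sum(t == p for t, p in zip(y_true, always_major)) / len(y_true)
    print(accuracy, balanced_fitness(y_true, always_major))

    # Hypothetical per-class rates of four candidate classifiers.
    candidates = [(0.9, 0.1), (0.5, 0.5), (0.4, 0.4), (0.1, 0.9)]
    print(pareto_front(candidates))
```

In the toy data, a classifier that always predicts the major class scores well on plain accuracy but poorly on the balanced fitness, which is the bias the abstract describes. The Pareto filter keeps the candidates that represent distinct trade-offs and discards any candidate that another one beats on every class.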

    HUMANE internal case study: eVACUATE #1

    This case study was conducted on 14 December 2015. Its purpose was to evaluate the usefulness of the HUMANE approach as perceived by relevant developers (software engineers), and to ask whether the HUMANE typology facilitates cross-disciplinary understanding. The files included here provide a summary of the analysis and the transcript from a semi-structured focus group.

    HUMANE external case study: eVACUATE #2

    This case study was conducted from September to October 2016 with the purpose of providing an external validation of the HUMANE typology and method. This eVACUATE case study comprises four different engagements in order to ensure a comprehensive evaluation: a quantitative online survey on the HUMANE design patterns; a quantitative survey on the HUMANE typology used for characterising Human-Machine Networks (HMNs); and two focus groups evaluating the HUMANE method (covering the profiling process, network diagramming, implication analysis, and design pattern approach). A summary of results is included here, along with focus group transcripts, surveys and survey results.

    Towards critical event monitoring, detection and prediction for self-adaptive future Internet applications

    The Future Internet (FI) will be composed of a multitude of diverse types of services that offer flexible, remote access to software features, content, computing resources, and middleware solutions through different cloud delivery models, such as IaaS, PaaS and SaaS. Ultimately, this means that loosely coupled Internet services will form a comprehensive base for developing value added applications in an agile way. Unlike traditional application development, which uses computing resources and software components under local administrative control, FI applications will thus strongly depend on third-party services. To maintain their quality of service, those applications therefore need to dynamically and autonomously adapt to an unprecedented level of changes that may occur during runtime. In this paper, we present our recent experiences on monitoring, detection, and prediction of critical events for both software services and multimedia applications. Based on these findings we introduce potential directions for future research on self-adaptive FI applications, bringing together those research directions


    Jakten på Raud den Rame: et studie av makt og samhandling i Saltens yngre jernalder [The hunt for Raud den Rame: a study of power and interaction in the Late Iron Age of Salten].

    The thesis examines economy, power and interaction in the Late Iron Age, focusing on the areas at and inside Saltstraumen. I take as my starting point the saga literature's accounts of Olav Tryggvason's encounter with Raud den Rame of Salten. The saga makes several claims about the political and economic conditions in Salten. Raud was allied with the Sami population of the district. Snorre relates that Raud had several hundred Sami in his retinue who were at his disposal when he needed them. My research questions follow from these claims. Using the archaeological material and other sources, I attempt to answer three central questions. Can a centre of power be identified in this area, like the one described in the saga literature? Is interaction between the Sami and the people of Hålogaland (håløyger) traceable in this area? How much can the saga literature actually tell us about society at this time? I argue that such a centre of power can be traced through the archaeological material, and with the help of other sources I believe I can narrow down considerably where it was located. Several of the claims in the saga literature turn out to be supported by the archaeological material. I emphasise that although one often wishes to place sites and finds in ethnic and cultural categories, this cannot always be done straightforwardly. Whether Raud den Rame existed, or is the product of long-term distortion of oral legends and of the saga authors' own contributions to those legends, can never be known. Nevertheless, the story of Raud den Rame tells us much about sociopolitical, economic and cultic conditions in Iron Age Salten. The relations between the Sami and the people of Hålogaland are strongly emphasised in Snorre's text. I believe this was a deliberate choice by the saga author. I argue that the contacts between the people of Hålogaland and the Sami population went far deeper than purely economic considerations. The chieftains' contacts with the Sami had an important religious and sociopolitical significance.
The demonisation of the Sami people and of Sami sorcery was perhaps an important means of doing away with the old customs and, by placing an enemy of the king and the church in an alliance with them, of underlining the political points.

    Business Process Risk Management and Simulation Modelling for Digital Audio-Visual Media Preservation.

    Digitised and born-digital Audio-Visual (AV) content presents new challenges for preservation and Quality Assurance (QA) to ensure that cultural heritage is accessible for the long term. Digital archives have developed strategies for avoiding, mitigating and recovering from digital AV loss using IT-based systems, involving QA tools before ingesting files into the archive and utilising file-based replication to repair files that may be damaged while in the archive. However, while existing strategies are effective for addressing issues related to media degradation, issues such as format obsolescence and failures in processes and people pose a significant risk to the long-term value of digital AV content. We present a Business Process Risk management framework (BPRisk) designed to support preservation experts in managing risks to long-term digital media preservation. This framework combines workflow and risk specification within a single risk management process designed to support continual improvement of workflows. A semantic model has been developed that allows the framework to incorporate expert knowledge from both preservation and security experts in order to intelligently aid workflow designers in creating and optimising workflows. The framework also provides workflow simulation functionality, allowing users to a) understand the key vulnerabilities in the workflows, b) target investments to address those vulnerabilities, and c) minimise the economic consequences of risks. The application of the BPRisk framework is demonstrated on a use case with the Austrian Broadcasting Corporation (ORF), discussing simulation results and an evaluation against the outcomes of executing the planned workflow.

    The development of a web-based application to predict the risk of gastrointestinal cancer in iron deficiency anaemia; the IDIOM app

    To facilitate the clinical use of the IDIOM score, an algorithm for predicting the risk of gastrointestinal malignancy in iron deficiency anaemia, a software application has been developed with a view to providing free and simple access to healthcare professionals in the UK. A detailed requirements analysis for intended users of the application revealed the need for an automated decision-support tool in which anonymised, individual patient data is entered and gastrointestinal cancer risk is calculated and displayed immediately, which lends itself to use in busy clinical settings. Human-centred design was employed to develop the solution, focusing on the users and their needs, whilst ensuring that they are provided with sufficient detail to interpret the risk score appropriately. The IDIOM App has been developed using R Shiny as a web-based application, enabling access from different platforms, with updates that can be carried out centrally through the host server. The application has been evaluated through literature search, internal/external validation, code testing, risk analysis, and usability assessments. Legal notices, a system for contacting the research and maintenance teams, and all supporting information for the application, such as a description of the population and intended users, have been embedded within the application interface. To provide a guide to developing standalone software medical devices in an academic setting, this paper presents the theoretical and practical aspects of developing, documenting, and certifying such devices, using the IDIOM App as an example.

    DAVID D2.2: Analysis of loss modes in preservation systems

    This is a report on the way in which loss and damage to digital AV content occurs for different content types, AV data carriers and preservation systems. Three different loss modes have been identified, and each has been analysed in terms of existing solutions and long-term effects. This report also includes an in-depth treatment of format compatibility (interoperability issues), format resilience to carrier degradation and format resilience to corruption.